R Configuration

Below we display our sessionInfo().

sessionInfo(package=NULL)
## R version 3.3.2 (2016-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X El Capitan 10.11.6
## 
## locale:
## [1] C
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] backports_1.0.5 magrittr_1.5    rprojroot_1.2   tools_3.3.2    
##  [5] htmltools_0.3.5 yaml_2.1.14     Rcpp_0.12.10    stringi_1.1.2  
##  [9] rmarkdown_1.3   knitr_1.15.1    stringr_1.1.0   digest_0.6.11  
## [13] evaluate_0.10
vis0

Introduction

Crime has always been a fascinating topic of discussion. It is human nature to pay attention to gruesome murders and moral corruption. Why? We don’t know. However, we do know that media coverage has helped spread the belief that unemployment leads to higher crime rates. Is this true? In this R Notebook, we explore the relationships between crime and unemployment, and we also analyze crime rates and state-level monetary measures on their own, in an effort to better understand the truth behind this belief. We give step-by-step instructions for how we created various graphs and charts in Tableau, and we explore the interactive visualizations produced with Shiny.

Data

Our employment dataset came from the University of Kentucky Center for Poverty Research (UKCPR). We focused only on the columns related to monetary measures (roughly the first 12 columns). Our crime data came from the U.S. Department of Justice, FBI’s annual Uniform Crime Reporting Statistics. We retrieved the data for all the crime categories listed on the website. Both sources are well-established and respected institutions, so we consider the data reliable for this project.

ETL

For both datasets we run the relevant ETL operations. We clean the data by first removing special characters (e.g. - ~) from the column names. We then decide which columns are measures and which are dimensions. For dimensions, we change NA to an empty string, change “&” to “and”, and change “:” to “;”. We also remove double quotes (") and apostrophes (’). For measures, we change NA to 0 and remove all characters except digits and the - sign.
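
The cleaning rules above can be sketched in R roughly as follows. This is a minimal illustration rather than our exact script: the helper names clean_names, clean_dimension, and clean_measure and the two-row sample data frame are hypothetical. The sketch also keeps the decimal point in measures, which is an assumption made so that values such as rates are not distorted.

```r
# Hypothetical helpers illustrating the cleaning rules described above.

# Column names: drop special characters such as "-" and "~".
clean_names <- function(x) gsub("[-~]", "", x)

# Dimensions: NA -> "", "&" -> "and", ":" -> ";", quotes removed.
clean_dimension <- function(x) {
  x <- as.character(x)
  x[is.na(x)] <- ""
  x <- gsub("&", "and", x, fixed = TRUE)
  x <- gsub(":", ";", x, fixed = TRUE)
  gsub("[\"']", "", x)
}

# Measures: keep digits and the minus sign, then NA -> 0.
# Assumption: the decimal point is kept as well.
clean_measure <- function(x) {
  x <- gsub("[^0-9.-]", "", as.character(x))
  x <- as.numeric(x)  # empty strings become NA here
  x[is.na(x)] <- 0
  x
}

# Made-up raw input, only to show the helpers in use.
raw <- data.frame(
  state_name = c("Texas", NA),
  Population = c("25,253,466", NA),
  stringsAsFactors = FALSE
)
raw$state_name <- clean_dimension(raw$state_name)
raw$Population <- clean_measure(raw$Population)
print(raw)
```

Splitting the rules into one helper per column type makes it easy to apply them with a loop over the measure and dimension column names.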

## [1] "Population"
## [1] "Employment"
## [1] "Unemployment"
## [1] "Unemployment.rate"
## [1] "Marginally.Food.Insecure"
## [1] "Food.Insecure"
## [1] "Very.Low.Food.Secure"
## [1] "Gross.State.Product"
## [1] "Number.of.low.income.uninsured.children"
## [1] "Percent.Low.Income.Unisured.Children"
## [1] "Personal.income"
## [1] "Workers..compensation"
##       state_name      state       year      Population      
##  Alabama   :  5   1      :  5   2010:51   Min.   :  564516  
##  Alaska    :  5   10     :  5   2011:51   1st Qu.: 1623796  
##  Arizona   :  5   11     :  5   2012:51   Median : 4382667  
##  Arkansas  :  5   12     :  5   2013:51   Mean   : 6158836  
##  California:  5   13     :  5   2014:51   3rd Qu.: 6789176  
##  Colorado  :  5   14     :  5             Max.   :38792291  
##  (Other)   :225   (Other):225                               
##    Employment        Unemployment     Unemployment.rate
##  Min.   :  283744   Min.   :  11152   Min.   : 2.700   
##  1st Qu.:  730472   1st Qu.:  54283   1st Qu.: 6.000   
##  Median : 1877812   Median : 157581   Median : 7.300   
##  Mean   : 2796855   Mean   : 244360   Mean   : 7.368   
##  3rd Qu.: 3235551   3rd Qu.: 288906   3rd Qu.: 8.650   
##  Max.   :17348645   Max.   :2244326   Max.   :13.500   
##                                                        
##  Marginally.Food.Insecure Food.Insecure    Very.Low.Food.Secure
##  Min.   :11.71            Min.   : 7.883   Min.   :2.036       
##  1st Qu.:22.16            1st Qu.:13.051   1st Qu.:4.402       
##  Median :25.47            Median :15.394   Median :5.378       
##  Mean   :25.45            Mean   :15.362   Mean   :5.368       
##  3rd Qu.:28.50            3rd Qu.:17.266   3rd Qu.:6.159       
##  Max.   :41.08            Max.   :25.224   Max.   :9.197       
##                                                                
##  Gross.State.Product Number.of.low.income.uninsured.children
##  Min.   :  26570     Min.   :  1.00                         
##  1st Qu.:  76363     1st Qu.: 14.00                         
##  Median : 190304     Median : 44.00                         
##  Mean   : 315502     Mean   : 83.01                         
##  3rd Qu.: 404486     3rd Qu.: 85.00                         
##  Max.   :2311616     Max.   :843.00                         
##                                                             
##  Percent.Low.Income.Unisured.Children Personal.income    
##  Min.   : 0.700                       Min.   :2.561e+07  
##  1st Qu.: 3.000                       1st Qu.:6.498e+07  
##  Median : 4.100                       Median :1.664e+08  
##  Mean   : 4.756                       Mean   :2.685e+08  
##  3rd Qu.: 6.100                       3rd Qu.:3.427e+08  
##  Max.   :15.000                       Max.   :1.978e+09  
##                                                          
##  Workers..compensation
##  Min.   :   7907      
##  1st Qu.:  36316      
##  Median : 110973      
##  Mean   : 298914      
##  3rd Qu.: 246610      
##  Max.   :2443512      
## 
##     state_name state year Population Employment Unemployment
## 5   California     5 2010   37334079   16091945      2244326
## 56  California     5 2011   37700034   16258133      2156967
## 107 California     5 2012   38056055   16602672      1921121
## 158 California     5 2013   38414128   16958735      1665590
## 209 California     5 2014   38792291   17348645      1406380
##     Unemployment.rate Marginally.Food.Insecure Food.Insecure
## 5                12.2                 29.61473      18.56144
## 56               11.7                 31.61105      19.07340
## 107              10.4                 27.89361      16.48582
## 158               8.9                 25.53890      15.72165
## 209               7.5                 24.16489      13.69295
##     Very.Low.Food.Secure Gross.State.Product
## 5                5.71067             1953411
## 56               6.10045             2030468
## 107              5.79252             2125717
## 158              5.06253             2202678
## 209              4.15612             2311616
##     Number.of.low.income.uninsured.children
## 5                                       763
## 56                                      770
## 107                                     653
## 158                                     488
## 209                                     341
##     Percent.Low.Income.Unisured.Children Personal.income
## 5                                    7.8      1617134250
## 56                                   7.8      1727433579
## 107                                  6.7      1838567162
## 158                                  5.0      1861956514
## 209                                  4.0      1977923740
##     Workers..compensation
## 5                 2067143
## 56                2062255
## 107               2042670
## 158               1990609
## 209               2072792
## [1] "Population"
## [1] "Violent.crime.total"
## [1] "Murder.and.nonnegligent.Manslaughter"
## [1] "Legacy.rape..1"
## [1] "Revised.rape..2"
## [1] "Robbery"
## [1] "Aggravated.assault"
## [1] "Property.crime.total"
## [1] "Burglary"
## [1] "Larceny.theft"
## [1] "Motor.vehicle.theft"
## [1] "Violent.Crime.rate"
## [1] "Murder.and.nonnegligent.manslaughter.rate"
## [1] "Legacy.rape.rate..1"
## [1] "Revised.rape.rate..2"
## [1] "Robbery.rate"
## [1] "Aggravated.assault.rate"
## [1] "Property.crime.rate"
## [1] "Burglary.rate"
## [1] "Larceny.theft.rate"
## [1] "Motor.vehicle.theft.rate"
##         State       Population       Violent.crime.total
##  Alabama   :  5   Min.   :  564554   Min.   :   622     
##  Alaska    :  5   1st Qu.: 1623654   1st Qu.:  5386     
##  Arizona   :  5   Median : 4379730   Median : 15452     
##  Arkansas  :  5   Mean   : 6157436   Mean   : 23812     
##  California:  5   3rd Qu.: 6784338   3rd Qu.: 27735     
##  Colorado  :  5   Max.   :38802500   Max.   :164133     
##  (Other)   :225                                         
##  Murder.and.nonnegligent.Manslaughter Legacy.rape..1 Revised.rape..2  
##  Min.   :   7.0                       Min.   :  99   Min.   :  110.0  
##  1st Qu.:  51.5                       1st Qu.: 533   1st Qu.:  772.8  
##  Median : 160.0                       Median :1190   Median : 1592.0  
##  Mean   : 285.6                       Mean   :1651   Mean   : 2258.2  
##  3rd Qu.: 389.0                       3rd Qu.:2012   3rd Qu.: 2518.0  
##  Max.   :1884.0                       Max.   :8398   Max.   :11527.0  
##                                                      NA's   :153      
##     Robbery      Aggravated.assault Property.crime.total    Burglary     
##  Min.   :   53   Min.   :  432      Min.   :   9551      Min.   :  1689  
##  1st Qu.: 1039   1st Qu.: 3376      1st Qu.:  42299      1st Qu.:  8058  
##  Median : 3689   Median : 9550      Median : 125377      Median : 26196  
##  Mean   : 6862   Mean   :14761      Mean   : 172925      Mean   : 39707  
##  3rd Qu.: 7358   3rd Qu.:18087      3rd Qu.: 204282      3rd Qu.: 47990  
##  Max.   :58116   Max.   :95877      Max.   :1049465      Max.   :245767  
##                                                                          
##  Larceny.theft    Motor.vehicle.theft Violent.Crime.rate
##  Min.   :  7273   Min.   :   244      Min.   :  99.3    
##  1st Qu.: 29452   1st Qu.:  3792      1st Qu.: 256.4    
##  Median : 89103   Median :  8626      Median : 329.5    
##  Mean   :119222   Mean   : 13996      Mean   : 372.7    
##  3rd Qu.:143460   3rd Qu.: 15407      3rd Qu.: 449.4    
##  Max.   :654626   Max.   :168608      Max.   :1326.8    
##                                                         
##  Murder.and.nonnegligent.manslaughter.rate Legacy.rape.rate..1
##  Min.   : 0.900                            Min.   : 9.70      
##  1st Qu.: 2.500                            1st Qu.:23.90      
##  Median : 4.200                            Median :29.00      
##  Mean   : 4.401                            Mean   :30.74      
##  3rd Qu.: 5.600                            3rd Qu.:36.05      
##  Max.   :21.800                            Max.   :89.10      
##                                                               
##  Revised.rape.rate..2  Robbery.rate    Aggravated.assault.rate
##  Min.   : 13.30       Min.   :  9.10   Min.   : 60.0          
##  1st Qu.: 32.08       1st Qu.: 54.60   1st Qu.:153.4          
##  Median : 38.00       Median : 85.10   Median :218.5          
##  Mean   : 41.34       Mean   : 96.13   Mean   :236.9          
##  3rd Qu.: 47.55       3rd Qu.:117.95   3rd Qu.:296.2          
##  Max.   :125.50       Max.   :715.00   Max.   :626.1          
##  NA's   :153                                                  
##  Property.crime.rate Burglary.rate    Larceny.theft.rate
##  Min.   :1524        Min.   : 257.2   Min.   :1161      
##  1st Qu.:2260        1st Qu.: 439.2   1st Qu.:1606      
##  Median :2726        Median : 568.3   Median :1938      
##  Mean   :2802        Mean   : 621.6   Mean   :1972      
##  3rd Qu.:3305        3rd Qu.: 796.5   3rd Qu.:2289      
##  Max.   :5182        Max.   :1157.6   Max.   :4082      
##                                                         
##  Motor.vehicle.theft.rate   Year   
##  Min.   : 38.9            2010:51  
##  1st Qu.:138.4            2011:51  
##  Median :198.2            2012:51  
##  Mean   :208.1            2013:51  
##  3rd Qu.:253.1            2014:51  
##  Max.   :835.7                     
## 
##     State Population Violent.crime.total
## 44  Texas   25253466              113231
## 95  Texas   25631778              104734
## 146 Texas   26060796              106475
## 197 Texas   26505637              108757
## 248 Texas   26956958              109414
##     Murder.and.nonnegligent.Manslaughter Legacy.rape..1 Revised.rape..2
## 44                                  1249           7622              NA
## 95                                  1130           7486              NA
## 146                                 1148           7715              NA
## 197                                 1140           7610           10456
## 248                                 1184           8236           11393
##     Robbery Aggravated.assault Property.crime.total Burglary Larceny.theft
## 44    32843              71517               951246   228597        654626
## 95    28620              67498               892810   215755        613131
## 146   30385              67227               876459   205002        606425
## 197   31810              65351               862289   191062        605440
## 248   31181              65656               813934   169234        576154
##     Motor.vehicle.theft Violent.Crime.rate
## 44                68023              448.4
## 95                63924              408.6
## 146               65032              408.6
## 197               65787              410.3
## 248               68546              405.9
##     Murder.and.nonnegligent.manslaughter.rate Legacy.rape.rate..1
## 44                                        4.9                30.2
## 95                                        4.4                29.2
## 146                                       4.4                29.6
## 197                                       4.3                28.7
## 248                                       4.4                30.6
##     Revised.rape.rate..2 Robbery.rate Aggravated.assault.rate
## 44                    NA        130.1                   283.2
## 95                    NA        111.7                   263.3
## 146                   NA        116.6                   258.0
## 197                 39.4        120.0                   246.6
## 248                 42.3        115.7                   243.6
##     Property.crime.rate Burglary.rate Larceny.theft.rate
## 44               3766.8         905.2             2592.2
## 95               3483.2         841.7             2392.1
## 146              3363.1         786.6             2327.0
## 197              3253.2         720.8             2284.2
## 248              3019.4         627.8             2137.3
##     Motor.vehicle.theft.rate Year
## 44                     269.4 2010
## 95                     249.4 2011
## 146                    249.5 2012
## 197                    248.2 2013
## 248                    254.3 2014

Tableau Visualizations

Crosstab: Map of Robbery vs Unemployment

vis1
vis2
vis3
vis4
vis5


Description:

This is a map of Robbery vs Unemployment per year (between 2010 and 2014). The darker the color, the higher the ratio of robberies to unemployed people in that state. Notice how, as we go from 2010 to 2014, high Robbery vs Unemployment ratios spread outward from the areas surrounding Nevada, DC, and Louisiana, like a virus!

Steps:

  1. Double Click on State to show the U.S. Map
  2. Click on Dimension and Create a Calculated Field. The formula is: SUM([Robbery])/SUM([Unemployment])
  3. Drag “Year1” to pages
  4. Drag the calculated field from step 2 onto Color, customize step/color to personal taste.
  5. Drag “State Name” for label.
  6. Drag “Population” for detail.
  7. Drag “State” to Filter, and filter out Hawaii and Alaska (We are just going to look at mainland USA)
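
For readers who want to reproduce the calculated field from step 2 outside Tableau, a rough R equivalent is sketched below. The data frame joined and all of its values are made up for illustration. The sketch assumes the crime and employment tables have already been joined by state and year.

```r
# Made-up joined data: one row per state and year.
joined <- data.frame(
  State = c("A", "A", "B", "B"),
  Year = c(2010, 2011, 2010, 2011),
  Robbery = c(100, 120, 50, 40),
  Unemployment = c(1000, 800, 500, 800)
)

# Mirror SUM([Robbery]) / SUM([Unemployment]) per state and year.
sums <- aggregate(cbind(Robbery, Unemployment) ~ State + Year,
                  data = joined, FUN = sum)
sums$robbery_per_unemployed <- sums$Robbery / sums$Unemployment
print(sums)
```

Summing before dividing matters when a state has more than one row per year: the ratio of sums is generally not the same as the sum or mean of per-row ratios.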

Histogram: Burglary Rate vs State

vis6
vis7
vis8
vis9
vis10


Description:

This is a histogram, with dots showing the burglary rate (# of burglaries per 100k people) for each state per year (between 2010 and 2014). The line represents the average burglary rate. Notice how, as we go from 2010 to 2014, the burglary rates decrease noticeably.

Steps:

  1. Click on Analysis and Unselect “Aggregated Measures”
  2. Drag “State Name” to Columns
  3. Drag “Burglary Rate” to Rows
  4. Drag “Year” to Pages
  5. Drag Measure Names to “Color”, change color as desired
  6. Go to Analytics and drag Average Line onto graph and select “Pane Line”
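
As a quick check of what “Burglary Rate” means, the sketch below recomputes it for Texas from the Population and Burglary values printed in the ETL section. The data frame name texas is just a label for this example. The result agrees with the Burglary.rate column shown there after rounding to one decimal place.

```r
# Texas rows copied from the crime data printed in the ETL section.
texas <- data.frame(
  Year = 2010:2014,
  Population = c(25253466, 25631778, 26060796, 26505637, 26956958),
  Burglary = c(228597, 215755, 205002, 191062, 169234)
)

# Burglaries per 100,000 people, rounded like the source column.
texas$Burglary.rate <- round(texas$Burglary / texas$Population * 1e5, 1)
print(texas)
```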

Scatterplot: Aggravated Assaults and Robbery

vis11
vis12
vis13
vis14
vis15


Description:

This is a scatterplot showing the relationship between Aggravated Assaults and Robbery per year (between 2010 and 2014). We can see a strong positive linear association between Robberies and Aggravated Assaults through the trend line displayed.

Steps:

  1. Click on Analysis and unselect “Aggregated Measures”
  2. Drag “Aggravated Assaults” to Rows
  3. Drag “Robberies” to Columns
  4. Drag “Year” to Pages
  5. Right click on the graph and click “Trend Lines,” then unselect the confidence band. The slope seems to be nearly the same for 2010 to 2013. However, the slope is a little steeper in 2014, which means that the number of aggravated assaults per additional robbery is higher than in previous years.
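
To make the idea of a trend-line slope concrete, here is a small least-squares fit in R. It is only a sketch: it uses the five Texas rows printed in the ETL section, so it fits one state across years, whereas the Tableau trend line is fit across all states within each year. The numbers it produces are therefore not the slopes seen in the visualization.

```r
# Texas 2010-2014, copied from the crime data printed in the ETL section.
texas <- data.frame(
  Robbery = c(32843, 28620, 30385, 31810, 31181),
  Aggravated.assault = c(71517, 67498, 67227, 65351, 65656)
)

# Ordinary least squares: aggravated assaults as a function of robberies.
fit <- lm(Aggravated.assault ~ Robbery, data = texas)

# The slope is the change in aggravated assaults per additional robbery.
slope <- unname(coef(fit)["Robbery"])
print(slope)
```

Running the same fit once per year on the full dataset would give slopes comparable to the ones Tableau draws.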

Boxplot: Year vs Property Crime Rate

vis16


Description:

This is a boxplot of the property crime rate for each year. We can see that the median property crime rate slowly decreases as the years progress. DC has the highest property crime rate in every year, but it only becomes an outlier in 2012, 2013, and 2014. This suggests that DC’s rate stayed high while the property crime rate fell in the other states, which makes DC stand out as an outlier in the later years.

Steps:

  1. Click on Analysis and unselect “Aggregated Measures”
  2. Drag “Year” to columns
  3. Drag “Property Crime Rate” to rows
  4. Drag “Year” to color
  5. Drag “State” to Detail
  6. Go to “Analytics” and drag average line to the chart and select Pane Line
  7. Click on Box Plot on the righthand side.
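
The sketch below shows how a point comes to be drawn as an outlier, assuming the common rule that whiskers extend 1.5 times the interquartile range beyond the box. It uses the Property.crime.rate quartiles and maximum from the summary in the ETL section. Those figures pool all five years and are rounded, so the fences are approximate and differ from the per-year fences in the chart.

```r
# Quartiles and maximum of Property.crime.rate from the summary output
# in the ETL section (all years pooled, rounded as printed).
q1 <- 2260
q3 <- 3305
max_rate <- 5182

# Assumed whisker rule: 1.5 * IQR beyond the quartiles.
iqr <- q3 - q1
upper_fence <- q3 + 1.5 * iqr
lower_fence <- q1 - 1.5 * iqr

# A value is flagged as an outlier when it falls outside the fences.
is_outlier <- max_rate > upper_fence
print(c(lower = lower_fence, upper = upper_fence))
print(is_outlier)
```

Because the fences depend on the quartiles, a state whose rate stays put can cross the upper fence simply because the box moves down around it.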

Barchart: State and Year vs Burglaries

vis17
vis18


Description:

This is a bar chart showing the number of burglaries in each state per year. We manually filtered for large East Coast states and West Coast states. Notice how West Coast burglaries are significantly higher (around 30k more) than East Coast burglaries. This could be because only a small number of states are classified as “west coast,” which gives a large standard deviation.

Steps:

  1. Click on Analysis and unselect “Aggregated Measures”
  2. Drag “Year” and “State” to Rows
  3. Drag “Burglaries” to Columns
  4. Right click on “Burglaries” and select “Measures” then “Average”
  5. Drag “State” to Filter and select designated states.
  6. Go to “Analytics” and drag average line to the chart and select Pane Line

Shiny Application


Crosstab: Crime to employment ratio

vis24


Description:

The crosstab shows a Crime to Employment Ratio for every state over the course of 2010-2014. Results indicate that D.C. has the highest crime rate by far. Despite changing the sliders multiple times, it stays at a high KPI. Most states have a medium KPI under most settings. The most peaceful states seem to be Vermont and Wyoming. Vermont especially keeps a very low KPI in almost all settings.
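
A KPI of this kind can be sketched as a ratio that is cut into low, medium, and high bands by two slider values. The function below is an illustration only: the name kpi_level, the default thresholds, and the input numbers are all hypothetical and are not taken from our Shiny app.

```r
# Hypothetical KPI: classify a crime-to-employment ratio into three bands.
# low_max and medium_max stand in for the two slider values.
kpi_level <- function(crime, employment, low_max = 0.005, medium_max = 0.01) {
  ratio <- crime / employment
  cut(ratio,
      breaks = c(-Inf, low_max, medium_max, Inf),
      labels = c("Low", "Medium", "High"))
}

# Made-up inputs: ratios of 0.001, 0.008, and 0.02.
levels_out <- kpi_level(crime = c(10, 80, 200),
                        employment = c(10000, 10000, 10000))
print(levels_out)
```

Moving the sliders corresponds to changing low_max and medium_max. A state whose ratio sits far above both, as described for D.C., stays in the high band whatever values are chosen within a reasonable range.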


Histogram: Employees aggregated every year

vis22


Description:

The histogram shows the number of employees aggregated by year from 2010 to 2014. Each year has its own histogram. The bucket size for the histograms is 200000 because of how varied populations are across the states. The biggest increase in overall employment happened between 2010 and 2011, though even this growth was slow. After that, employment seems to have increased very slowly over the next 2 years.
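
To illustrate what a bucket size of 200000 means, the sketch below assigns employment values to buckets by rounding down to the nearest multiple of the bucket width. It uses the five California Employment values printed in the ETL section. The actual histograms bin all states within a single year, so this only shows the binning arithmetic.

```r
# California Employment, 2010-2014, from the ETL section output.
employment <- c(16091945, 16258133, 16602672, 16958735, 17348645)

# Each value falls in the bucket starting at the nearest lower
# multiple of the bucket width.
bin_width <- 200000
bin_start <- floor(employment / bin_width) * bin_width
print(data.frame(employment, bin_start))
```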


Scatterplot: Food Insecurity vs Violent Crime Rate

vis23


Description:

The scatterplot shows a food insecurity ratio on the Y axis and the violent crime rate on the X axis. The only real outlier is California, which has a lower food insecurity ratio but the highest crime rate by far. One interesting result is that Mississippi sees a huge drop in food insecurity after 2010 and then slowly goes back up starting in 2011. This is interesting because Mississippi starts off with the highest food insecurity and then drops by a lot, while its violent crime rate stays the same. The 2014 data seems to indicate that food insecurity on average went up by much more than in past years. Very few states have a food insecurity ratio of 10 or less.


Boxplot: State vs GSP

vis21


Description:

The boxplot shows each state’s GSP (Gross State Product) over 2010-2014. California has by far the largest GSP, with its lowest value of 1953411 being higher than any other state’s GSP. Texas comes in second place, followed by New York. These states also have the biggest gaps between the top and bottom of their boxes. The smaller GSPs are all much more tightly clustered. Vermont has all 5 of the lowest GSP values, but its GSP has been going up over the past few years.


Barchart: Violent Crimes Per Year

vis19
vis20


Description:

The black bars on the barchart represent the total number of violent crimes committed. Each state has its own graph showing the number of crimes committed in each year from 2010 to 2014. The red line represents the mean number of violent crimes committed in that particular state over the timespan. The blue line represents the difference between the mean and the respective year. The range between the state with the fewest crimes and the state with the most crimes was shocking. California averaged close to 158000 crimes a year, while Vermont averaged only around 800 a year. Outside of that, the results seemed to indicate that the level of violent crime stayed fairly consistent in most of the US over the time period.
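
The red and blue lines can be reproduced with a few lines of R. The sketch below uses the Texas Violent.crime.total values printed in the ETL section. The sign convention for the difference (year minus mean) is an assumption, since the chart description does not say which way it is taken.

```r
# Texas Violent.crime.total, 2010-2014, from the ETL section output.
violent <- c(113231, 104734, 106475, 108757, 109414)
names(violent) <- 2010:2014

# Red line: the state's mean over the five years.
mean_violent <- mean(violent)

# Blue line: each year's difference from that mean
# (assumed here to be year minus mean).
diff_from_mean <- violent - mean_violent

print(mean_violent)
print(diff_from_mean)
```

For Texas the mean is 108522.2, and the largest deviation is in 2010, which sits 4708.8 above it.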


Conclusion

The data yields many interesting observations. From high Robbery vs Unemployment ratios spreading outward from the areas surrounding Nevada, DC, and Louisiana, to West Coast burglaries being significantly higher than East Coast burglaries, we get a broad picture of how often crimes have been happening in recent years. Regardless of how a state is doing in terms of monetary stability or unemployment rates, we observed burglary rates and property crime rates decreasing over the years. If conditions stay the same, it seems reasonable to expect that the coming years won’t be a “Prime Time for Crime.”